CATALYTIC DNA
Background of the Invention
This invention relates to DNA molecules having
catalytic activity and methods of obtaining and using
such DNA molecules.

Ribozymes are highly structured RNA molecules that 
carry out specific chemical reactions (e.g., cleavage of 
RNA, cleavage of DNA, polymerization of RNA, and 
replication of RNA), often with kinetic efficiencies
comparable to those of most engineered enzymes. 

Summary of the Invention
The invention features nucleic acid molecules
having catalytic activity, as well as methods for
obtaining and using such nucleic acid molecules.

The methods of the invention entail sequential 
in vitro selection and isolation of nucleic acid 
molecules having the desired properties (e.g., catalytic 
activity, such as ligase activity) from pools of
single-stranded nucleic acid molecules (e.g., DNA, RNA, or
modifications or combinations thereof) containing random
sequences. The isolated nucleic acid molecules are then
amplified by using, e.g., the polymerase chain reaction
(PCR).

The rounds of selection and amplification may be
repeated one or more times, after each round, the pool of
molecules being enriched for those molecules having the
desired activity. Although the number of desired
molecules in the initial pool may be exceedingly small,
the sequential selection scheme overcomes this problem by
repeatedly enriching for the desired molecules.

The pool of single-stranded nucleic acid molecules 
employed in the invention may be referred to as "random 
nucleic acid molecules" or as containing "random
sequences." These general terms are used to describe
molecules or sequences which have one or more regions of
"fully random sequence." In a fully random sequence,
there is an approximately equal probability of A, T/U, C,
or G being present at each position in the sequence. Of
course, the limitations of some methods used to create
nucleic acid molecules make it rather difficult to
synthesize fully random sequences in which the
probability of each nucleotide occurring at each position
is absolutely equal. Accordingly, sequences in which the
probabilities are roughly equal are considered fully
random sequences.

In "partially random sequences" and "partially 
randomized sequences," rather than there being a 25%
chance of A, T/U, C, or G being present at each position,
there are unequal probabilities. For example, in a
partially random sequence, there may be a 70% chance of A
being present at a given position and a 10% chance of
each of T/U, C, or G being present at that position.
Further, the probabilities can be the same or different
at each position within the partially randomized region.
Thus, a partially random sequence may include one or more
positions at which the sequence is fully random, one or
more positions at which the sequence is partially random,
and/or one or more positions at which the sequence is
defined.

Partially random sequences are particularly useful 
when one wishes to make variants of a known sequence. 
For example, if one knows that a particular 50 nucleotide
sequence possesses a desired catalytic activity and that
positions 5, 7, 8, and 9 are critical for this activity,
one could prepare a partially random version of the
50 nucleotide sequence in which the bases at positions 5,
7, 8, and 9 are the same as in the catalytically active
sequence, and the other positions are fully randomized.
Alternatively, one could prepare a partially random
sequence in which positions 5, 7, 8, and 9 are partially
randomized, but with a strong bias towards the bases
found at each position in the original molecule, with all
of the other positions being fully randomized. This type
of partially random sequence is desirable in pools of
molecules from which catalytic nucleic acids are being
selected. The sequence of any randomized region may be
further randomized by mutagenesis during one or more
amplification steps.
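
The pool designs described above can be illustrated in silico. The following is a minimal sketch only, not a synthesis protocol: it assumes that each position is drawn independently from a table of base weights. The function names, the seed, the pool size of 200, and the 50 nucleotide parent sequence are hypothetical and are not taken from this disclosure.

```python
import random

BASES = "ATCG"  # use "AUCG" for an RNA pool


def fully_random():
    """Equal weights for all four bases at one position."""
    return {b: 0.25 for b in BASES}


def biased(base, p):
    """Favor `base` with probability p; split the remainder equally."""
    rest = (1.0 - p) / (len(BASES) - 1)
    return {b: (p if b == base else rest) for b in BASES}


def design_weights(parent, critical, bias=1.0):
    """Per-position base weights for a variant pool of `parent`.

    critical: 1-based positions kept as in the parent (bias=1.0, i.e. a
    defined position) or merely favored (e.g. bias=0.7). All other
    positions are fully random.
    """
    return [
        biased(base, bias) if pos in critical else fully_random()
        for pos, base in enumerate(parent, start=1)
    ]


def sample_sequence(weights, rng):
    """Draw one sequence, one position at a time."""
    return "".join(
        rng.choices(BASES, weights=[w[b] for b in BASES])[0] for w in weights
    )


rng = random.Random(0)  # fixed seed, for a repeatable illustration only
parent = "".join(rng.choice(BASES) for _ in range(50))  # hypothetical parent
critical = {5, 7, 8, 9}

# Positions 5, 7, 8, and 9 defined; all other positions fully random.
fixed_pool = [
    sample_sequence(design_weights(parent, critical), rng) for _ in range(200)
]
# Positions 5, 7, 8, and 9 biased 70% toward the parent base.
doped_pool = [
    sample_sequence(design_weights(parent, critical, bias=0.7), rng)
    for _ in range(200)
]
```

With bias=1.0 the critical positions behave as defined sequence, and lowering the bias yields the second, partially randomized design in which those positions only favor the original bases.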

In addition to random or partially random 
sequences, it may also be desirable to have one or more 
regions of "defined sequence." A defined sequence is a
sequence selected or known by the creator of the
molecule. Defined sequence regions are useful for
isolating or PCR amplifying the nucleic acid molecule
because they may be recognized by defined complementary
primers. The defined sequence regions may flank the
random regions or be intermingled with the random
regions. The defined regions can be of any length
desired and are readily designed using knowledge in the
art (see, for example, Ausubel et al., Current Protocols
in Molecular Biology, Greene Publishing, New York, New
York (1994); Ehrlich, PCR Technology, Stockton Press, New
York, New York (1989); and Innis et al., PCR Protocols: A
Guide to Methods and Applications, Academic Press, Inc.,
San Diego, CA (1990)).

The selection method of the invention involves 
contacting a pool of nucleic acid molecules containing
random sequences with the substrate for the desired
catalytic activity under conditions (including, e.g.,
nucleic acid molecule concentrations, temperature, pH,
and salt) which are favorable for the catalytic activity.
Nucleic acid molecules having the catalytic activity are
partitioned from those which do not, and the partitioned
nucleic acid molecules having the catalytic activity then
are amplified using, e.g., PCR.

The steps of contacting, partitioning, and 
amplifying may be repeated any desired number of times. 
Several cycles of selection (contacting, partitioning,
and amplifying) may be desirable because after each round
the pool is more enriched for the desired catalytic
nucleic acids. One may choose to perform so many cycles
of selection that no substantial improvement in catalytic
activity is observed upon further selection, or one may
carry out far fewer cycles of selection.
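
The effect of repeated cycles on pool composition can be estimated with a simple calculation. The sketch below is an idealized model, not a description of any experiment herein: it assumes that each cycle multiplies the ratio of active to inactive molecules by a constant enrichment factor. The starting fraction (1 in 10^12) and the factor (1000 per cycle) are hypothetical values chosen only for illustration.

```python
def fraction_after_rounds(initial_fraction, enrichment, rounds):
    """Fraction of active molecules after a number of selection cycles.

    Assumes (hypothetically) that every cycle multiplies the odds
    active:inactive by the same factor `enrichment`.
    """
    odds = initial_fraction / (1.0 - initial_fraction) * enrichment ** rounds
    return odds / (1.0 + odds)


def rounds_to_reach(initial_fraction, enrichment, target_fraction, max_rounds=50):
    """Smallest number of cycles giving at least `target_fraction` actives."""
    for n in range(max_rounds + 1):
        if fraction_after_rounds(initial_fraction, enrichment, n) >= target_fraction:
            return n
    return None  # target not reached within max_rounds


# Hypothetical numbers: 1 active molecule in 10^12, 1000-fold per cycle.
for n in range(7):
    print(n, fraction_after_rounds(1e-12, 1000.0, n))
```

Under these assumed values the active fraction rises by roughly three orders of magnitude per cycle until it approaches one, after which further cycles give little additional enrichment.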

Methods known in the art may be used at particular 
steps of this selection and isolation procedure, and one 
skilled in the art is referred to Ellington and Szostak,
Nature 346:818-822, 1990; Lorsch and Szostak, Nature
371:31-36, 1994; Tuerk and Gold, Science 249:505-510, 
1990; and methods described herein. 

In addition, one may mutagenize isolated catalytic 
nucleic acids in order to generate and subsequently
isolate molecules exhibiting improved catalytic activity.
For example, one may prepare degenerate pools of
single-stranded nucleic acids based on a particular catalytic
nucleic acid sequence, or one may first identify
important regions in a catalytic nucleic acid sequence
(for example, by standard deletion analysis), and then
prepare pools of candidate catalytic nucleic acid
molecules that include degenerate sequences at those
important regions.

Those skilled in the art can readily identify
catalytic nucleic acid consensus sequences by sequencing
a number of catalytic nucleic acid molecules and
comparing their sequences. In some cases, such
sequencing and comparison will reveal the presence of a
number of different conserved sequences. In these
circumstances, one may identify a core sequence which is
common to most or all of the isolated sequences. This
core sequence, or variants thereof, may be used as the
starting point for the selection of improved catalysts.
By "variant" of a sequence is meant a sequence created by
partially randomizing the sequence.

The size of the randomized regions employed should 
be adequate to provide a catalytic site. Thus, the 
randomized region used in the initial selection 
preferably includes between 10 and 300 nucleotides, for
example, between 25 and 180 nucleotides.

It may be desirable to increase the stringency of
a selection step in order to isolate more desirable
molecules. The stringency of the selection step may be
increased by decreasing substrate concentration. The
stringency of the catalysis selection step can be increased
by decreasing the ligand concentration or the reaction time.

In one aspect, therefore, the invention features a 
method for obtaining a nucleic acid molecule having 
ligase activity. In the first step of this method, a
population of candidate nucleic acid molecules, each
having a region of random sequence, is contacted with a
substrate nucleic acid molecule and an external template.
The external template is complementary to a portion of
the 3' region of the substrate nucleic acid molecule and
a portion of the 5' region of each of the candidate
nucleic acid molecules in the population. Alternatively,
the external template may be complementary to a portion
of the 5' region of the substrate nucleic acid molecule
and a portion of the 3' region of each of the candidate
nucleic acid molecules in the population. Binding of the
external template to the substrate nucleic acid molecule
and a candidate nucleic acid molecule from the population
juxtaposes the 3' region of one of the molecules with the
5' region of the other.

One of the terminal nucleotides (either the 5' or
the 3' nucleotide) of the juxtaposed regions may contain
an activated group. Activated groups that may be used in
the method of the invention include, but are not limited
to, 5'-phosphoro(2-methyl)imidazolide, a
5'-phosphorimidazolide, cyanogen bromide, and carbodiimides
(e.g., 1-ethyl-3-(3'-dimethylaminopropyl)carbodiimide (CDI),
1-cyclohexyl-3-(2-morpholinyl-(4)-ethyl)-carbodiimide
metho-p-toluenesulfonate, CDI-1, and CDI-2).
As a specific example, the activated group is a
3'-phosphorimidazolide on the 3' terminal nucleotide of the
substrate. Activating groups are added to the nucleic
acid molecules used in the methods of the invention by
using methods known in the art.

Alternatively, if desired, this first step
external templating may be omitted. It is not essential
to the selection method of the invention.

In the second step of this method of the 
invention, a subpopulation of nucleic acid molecules
having ligase activity is isolated from the population.
This may be accomplished by, e.g., affinity
chromatography followed by selective PCR amplification.
For example, the substrate nucleic acid and/or the
nucleic acid from the population may contain the first
member of a specific binding pair (e.g., biotin). As a
specific example, the terminal nucleotide of the
substrate nucleic acid (e.g., the 5' terminal nucleotide
of the substrate nucleic acid) and/or the nucleic acid
molecule from the population that is not juxtaposed by
the external template may be labeled with biotin.
Isolation of molecules containing biotin may be
accomplished by contacting the molecules with immobilized
avidin, e.g., a streptavidin agarose affinity column.
Other specific binding pairs known to one skilled in the
art may be used in the method of the invention.

The isolated subpopulation may be amplified in
vitro using, e.g., PCR. In selective PCR, the first
primer is complementary to a sequence of the substrate
nucleic acid molecule and the second primer is
complementary to the opposite strand of a sequence in the
population. Use of these primers therefore results in
amplification of only those nucleic acid molecules which
are a product of the ligation of the substrate to a nucleic
acid molecule from the population. In order to generate
a population of nucleic acid molecules for further rounds
of selection, nested PCR amplification may be carried out
using primers which preferably include the terminal
nucleotides of the nucleic acid from the population that
was ligated to the substrate nucleic acid.

The above-described steps of contacting,
isolating, and amplifying may be repeated on the
subpopulations of nucleic acid molecules obtained. The
additional rounds of selection may be carried out in the
presence or absence of the external template. Nucleic
acid molecules isolated using the above-described method
may be subcloned into a vector (e.g., a plasmid) and
further characterized by, e.g., sequence analysis.

In a second aspect, the invention features a DNA 
molecule capable of acting as a catalyst. A catalyst is
a molecule which enables a chemical reaction to proceed
under different conditions (e.g., at a lower temperature, 
with lower reactant concentrations, or with increased 
kinetics) than otherwise possible. 

In a third aspect, the invention features a DNA
molecule capable of acting as a catalyst on a nucleic
acid substrate. This catalysis does not require the 
presence of a ribonucleotide in the nucleic acid 
substrate. 

In a fourth aspect, the invention features a 
nucleic acid molecule having ligase activity, e.g., DNA
or RNA ligase activity. The nucleic acid molecule may be
DNA, RNA, or combinations or modifications thereof. 

In a fifth aspect, the invention features a 
nucleic acid molecule capable of ligating a first
substrate nucleic acid to a second substrate nucleic
acid. The rate of ligation catalyzed by the nucleic acid
molecule of the invention is greater than the rate of
ligation of the substrate nucleic acids by templating
under the same reaction conditions, which include such
variables as, e.g., substrate concentration,
template/enzyme concentration, nature and quantity of
base-pairing interactions between substrates and
template/enzyme, type of activating group, salt, pH, and
temperature. Templating is the joining of two substrate
nucleic acid molecules when hybridized to contiguous
regions of a "template" nucleic acid strand.

In a sixth aspect, the invention features a 
catalytic DNA molecule capable of ligating a first 
substrate nucleic acid to a second substrate nucleic
acid. The first substrate nucleic acid contains the
sequence 3'-S1-S2-5', the second substrate nucleic acid
contains the sequence 3'-S3-S4-5', and the catalytic DNA
molecule contains the sequence
5'-E1-TTT-E2-AGA-E3-E4-E5-E6-3'.

For these substrate and catalytic DNA molecules,
S1 contains at least two (for example, 2-100, 4-16, or
8-12) nucleotides positioned adjacent to the 3' end of S2.
The S1 nucleotides are complementary to an equivalent
number of nucleotides in E1 that are positioned adjacent
to the 5' end of TTT.

S2 contains one to three (for example, 1)
nucleotides, S3 contains one to six (for example, 3)
nucleotides, and the 5' terminal nucleotide of S2 and the
3' terminal nucleotide of S3 alternatively contain an
activated group or a hydroxyl group.

S4 contains at least two (for example, 2-100,
4-16, or 8-12) nucleotides positioned adjacent to the 5'
end of S3. The S4 nucleotides are complementary to an
equivalent number of nucleotides in E6 that are
positioned adjacent to the 3' end of E5.

E1 contains at least two (for example, 2-100,
4-16, or 8-12) nucleotides positioned adjacent to the 5'
end of TTT. The E1 nucleotides are complementary to an
equivalent number of nucleotides in S1 that are
positioned adjacent to the 3' end of S2.

E2 contains 0-12 nucleotides, for example, 3-4
nucleotides.

E3 contains at least two (for example, 2-100,
3-50, 5-20, or 5) nucleotides positioned adjacent to the 3'
end of said AGA, said E3 nucleotides being complementary
to an equivalent number of nucleotides in E5 that are
positioned adjacent to the 5' end of E6.

E4 contains at least 3 (for example,
3-200, 3-30, 3-8, 4-6, or 5) nucleotides. Alternatively,
E4 may contain zero nucleotides. In this case, the 3'
end of E3 and the 5' end of E5 would not be linked to
another nucleic acid segment (e.g., E4), and the enzyme
therefore would be made up of two separate nucleic acid
molecules (the first containing 5'-E1-TTT-E2-AGA-E3-3',
and the second containing 5'-E5-E6-3').

E5 contains at least two (for example, 2-100,
3-50, 5-20, or 5) nucleotides positioned adjacent to the 5'
end of E6. The E5 nucleotides are complementary to an
equivalent number of nucleotides in E3 that are
positioned adjacent to the 3' end of AGA.

E6 contains at least two (for example, 2-100,
4-16, or 8-12) nucleotides positioned adjacent to the 3'
end of E5. The E6 nucleotides are complementary to an
equivalent number of nucleotides in S4 that are
positioned adjacent to the 5' end of S3.

In the case of long stem structures formed by,
e.g., S1 and E1, S4 and E6, or E3 and E5, the stem
structures may contain mismatches, provided that a stem
structure is maintained.

The 5' most nucleotide of S2, the 3' most
nucleotide of S3, and the second 3' most nucleotide of S3
may be complementary to the 5' most nucleotide of E2, the
second 5' most nucleotide of E2, and the third 5' most
nucleotide of E2, respectively. In addition, E2 may
contain four nucleotides, and the third 3' most
nucleotide of S3 may be complementary to the fourth 5'
most nucleotide of E2.
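
The segment rules set out above for S1-S4 and E1-E6 can be checked mechanically. The following is a minimal sketch under simplifying assumptions: it requires perfect, equal-length pairing (whereas the text permits mismatches in long stems), it handles DNA bases only, and it does not check the optional E2 pairing. All function names and example sequences are hypothetical and are not sequences of the invention.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def pairs_with(enzyme_5to3, substrate_3to5):
    """True if an enzyme segment written 5'->3' pairs base by base with a
    substrate segment written 3'->5' (the notation of 3'-S1-S2-5')."""
    return len(enzyme_5to3) == len(substrate_3to5) and all(
        COMPLEMENT[e] == s for e, s in zip(enzyme_5to3, substrate_3to5)
    )


def reverse_complement(seq_5to3):
    """Reverse complement of a strand written 5'->3'."""
    return "".join(COMPLEMENT[b] for b in reversed(seq_5to3))


def assemble_catalyst(e1, e2, e3, e4, e5, e6):
    """Build 5'-E1-TTT-E2-AGA-E3-E4-E5-E6-3' from its segments."""
    return e1 + "TTT" + e2 + "AGA" + e3 + e4 + e5 + e6


def check_design(s1, s2, s3, s4, e1, e2, e3, e4, e5, e6):
    """Return a list of rule violations (empty if the design passes)."""
    problems = []
    for name, seg in (("S1", s1), ("S4", s4), ("E1", e1),
                      ("E3", e3), ("E5", e5), ("E6", e6)):
        if len(seg) < 2:
            problems.append(name + " needs at least two nucleotides")
    if not 1 <= len(s2) <= 3:
        problems.append("S2 needs one to three nucleotides")
    if not 1 <= len(s3) <= 6:
        problems.append("S3 needs one to six nucleotides")
    if len(e2) > 12:
        problems.append("E2 may contain at most 12 nucleotides")
    if len(e4) in (1, 2):  # zero is allowed: the enzyme is then two molecules
        problems.append("E4 needs at least 3 nucleotides, or none")
    if not pairs_with(e1, s1):
        problems.append("E1 is not complementary to S1")
    if not pairs_with(e6, s4):
        problems.append("E6 is not complementary to S4")
    if e5 != reverse_complement(e3):
        problems.append("E5 does not form a stem with E3")
    return problems


# Hypothetical segments; substrates are written 3'->5', enzyme 5'->3'.
s1, s2, s3, s4 = "CCTGAGCA", "G", "CTT", "ACGTCAGG"
e1 = "".join(COMPLEMENT[b] for b in s1)
e6 = "".join(COMPLEMENT[b] for b in s4)
e2, e3, e4 = "CGA", "GGCAC", "TTTTT"
e5 = reverse_complement(e3)
catalyst = assemble_catalyst(e1, e2, e3, e4, e5, e6)
print(catalyst, check_design(s1, s2, s3, s4, e1, e2, e3, e4, e5, e6))
```

Because the substrates are written 3' to 5' and the catalyst 5' to 3', the S1/E1 and S4/E6 pairs are compared position by position, whereas E3 and E5 lie on the same strand and are therefore compared as reverse complements.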

In a seventh aspect, the invention features a 
method of ligating a first nucleic acid molecule to a 
second nucleic acid molecule. In this method, the first
and second nucleic acid molecules are contacted with a
nucleic acid molecule having ligase activity (e.g., DNA
ligase activity). The nucleic acid molecule having
ligase activity, as well as the first and second nucleic
acid molecules, may contain DNA, RNA, or modifications or
combinations thereof.

The ease with which DNA oligonucleotides can be 
synthesized and their relatively high stability represent 
major advantages over other biopolymer catalysts, such as 
proteins and RNA, for, e.g., industrial, research, and
therapeutic applications. Other features and advantages
of the invention will be apparent from the following 
description of the preferred embodiments thereof, and 
from the claims. 

Detailed Description

The drawings are first described. 
Drawings 

Fig. 1 is a schematic representation of the in 
vitro selection strategy used to isolate DNA molecules
having DNA ligase activity. Each molecule in the single
stranded DNA (ssDNA) pool contained 116 random bases
flanked by constant regions having sequences
complementary to the PCR primers
5'-GGAACACTATCCGACTGGCACC-3' (SEQ ID NO: 29) and
5'-biotin-CGGGATCCTAATGACCAAGG-3' (SEQ ID NO: 30). The pool
was prepared by solid-phase phosphoramidite chemistry and
amplified by PCR (Ellington et al., Nature 355:850-852,
1992) to yield approximately 32 copies of 3.5 x 10^14
different molecules. Single stranded DNA was prepared
from the amplified pool as described by Bock et al.
(Nature 355:564-566, 1992). The activated substrate
(5'-biotin-AAGCATCTAAGCATCTCAAGC-p-Im (SEQ ID NO: 31))
contained a 5'-biotin group and a 3'-phosphorimidazolide
(Chu et al., Nucleic Acids Res. 14:5591-5603, 1986).
Eight copies of the DNA pool (0.5 µM) were incubated in
selection buffer (30 mM Hepes, pH 7.4, 600 mM KCl, 56 mM
MgCl2, 1 mM ZnCl2) with 1 µM activated substrate and 1 mM
of an external template (5'-CGGATAGTGTTCCGCTTGAGATGCTT-3'
(SEQ ID NO: 32)) complementary to the 5' end of the pool
and the 3' end of the activated substrate. After a two
hour incubation, the reaction was stopped by addition of
EDTA. 0.5% ligated product was present after 24 hr. No
product formation was observed in the absence of the
external template. At cycle 7, pool activity was
independent of the external template, indicating that the
remaining pool molecules were using an internal substrate
binding site. In cycles 8 and 9, no external template
was added, and the reaction time was decreased to 2 and
0.5 minutes, respectively, in order to increase selection
stringency. To isolate ligated molecules, the reacted
pool was passed through a streptavidin agarose affinity
column (Pierce, Rockford, IL), unligated pool was washed
off the column under denaturing conditions (3 M urea
followed by 150 mM NaOH, 40 column volumes each), and the
ligated pool was specifically eluted with excess free
biotin (Wilson et al., Nature, in press, 1995). To
select for substrate ligation to the 5'-hydroxyl of the
pool molecules, isolated DNA was selectively PCR
amplified (in cycles 6-9 only) with a first primer
corresponding to the substrate sequence and a second
primer complementary to the 3' constant region of the
pool, and gel purified. This pool was then subjected to
nested PCR with the first set of primers, gel purified,
and re-amplified for ssDNA isolation (Bock et al., Nature
355:564-566, 1992). Nine cycles of
selection-amplification were performed, after which the pool
activity remained constant.
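
The stated complementarity of the external template (SEQ ID NO: 32) can be verified directly from the sequences given in the description of Fig. 1. The sketch below does so under one assumption that is labeled in the code: the 5' constant region of the pool is taken to have the same sequence as the primer of SEQ ID NO: 29. The function names are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Reverse complement of a DNA strand written 5'->3'."""
    return "".join(COMPLEMENT[b] for b in reversed(seq))


# Sequences as given in the description of Fig. 1, written 5'->3'.
SUBSTRATE = "AAGCATCTAAGCATCTCAAGC"       # SEQ ID NO: 31 without biotin / p-Im
TEMPLATE = "CGGATAGTGTTCCGCTTGAGATGCTT"   # SEQ ID NO: 32
# Assumption: the pool's 5' constant region equals the SEQ ID NO: 29 primer.
POOL_5P_CONSTANT = "GGAACACTATCCGACTGGCACC"


def template_footprint(template, substrate, pool_5p):
    """Return (n_sub, n_pool): how many 3'-terminal substrate nucleotides
    and 5'-terminal pool nucleotides the template pairs with, or None."""
    target = reverse_complement(template)
    for n_sub in range(1, len(target)):
        n_pool = len(target) - n_sub
        if n_sub > len(substrate) or n_pool > len(pool_5p):
            continue
        if target == substrate[-n_sub:] + pool_5p[:n_pool]:
            return n_sub, n_pool
    return None


print(template_footprint(TEMPLATE, SUBSTRATE, POOL_5P_CONSTANT))  # (13, 13)
```

Under that assumption, the 26 nucleotide template pairs with the 13 3'-terminal nucleotides of the substrate and the 13 5'-terminal nucleotides of the pool, placing the 3'-phosphorimidazolide next to the 5'-hydroxyl of the pool molecule.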

Fig. 2A is a denaturing acrylamide gel analysis of 
a time course of ligation reactions catalyzed by pool 9
ssDNA. Internally labeled pool 9 DNA (0.5 µM) was
incubated with activated substrate (1 µM) in selection
buffer for the indicated times. In a control reaction,
the substrate was not activated (lane 5). DNAs were
separated by electrophoresis in a 6% polyacrylamide/8 M
urea gel. Radioactivity was detected using a Molecular
Dynamics PhosphorImager.

Fig. 2B is a schematic representation of the 
sequences of clones isolated from pool 9 DNA. DNA from 
pool 9 was amplified by PCR and cloned into pT7Blue
T-Vector (Novagen, Madison, WI). Each of the clones
analyzed was sequenced in both directions using the
standard dideoxy sequencing method. The 21 sequences
(SEQ ID NOs: 1-21) shown in the figure share a consensus
sequence consisting of two conserved domains (SEQ ID NOs:
22 and 23). Upper and lower case letters in the
consensus indicate highly and moderately conserved
positions, respectively. X and Z represent
non-conserved, but complementary bases. The bolded T in
domain I is present in 50% of the clones.

Fig. 3A is a schematic representation of the
proposed secondary structure for the consensus sequence
of the DNA molecules having DNA ligase activity isolated
from pool 9 DNA. The 5' end of domain I and the 3' end
of domain II base-pair with the 5' constant region of the
pool (SEQ ID NO: 25) and the activated substrate (SEQ ID
NO: 24), respectively. The two complementary regions
("NNNN" of SEQ ID NO: 26 and "NNNN" of SEQ ID NO: 27)
form a stem structure and bring the flanking domains into
close proximity. Dotted lines indicate possible
interactions between the bases at the ligation junction
and the sequence between the two boxed sequences, TTT and
AGA.

Fig. 3B is a schematic representation of a minimal
DNA catalyst (SEQ ID NO: 28). Non-conserved regions in
the DNA structure shown in Fig. 3A were deleted in order
to generate a three-fragment complex in which the
formation of a phosphodiester bond between the
3'-phosphorimidazolide substrate S1 and the 5'-hydroxyl
substrate S2 is catalyzed by the 47 nucleotide
metalloenzyme E47.

Fig. 3C is a denaturing acrylamide gel analysis of 
a time course of ligation of activated substrate S1 and
radiolabeled substrate S2 by the catalyst E47. No
reaction was detectable when activated S1 (lanes 1 and 5)
or E47 (lane 6) was absent.

Fig. 3D is a table showing the initial rates of 
ligation catalyzed by E47, E47-3T, E47-AGA, E47-hairpin, 
and pool 9 ssDNA. Activated substrate S1 (1 µM) and
radiolabeled S2 (0.5 µM; S2 was 3'-end labeled using
[α-32P]-cordycepin-5'-triphosphate (NEN Dupont, Boston, MA)
and terminal transferase (Promega, Madison, WI)) were
incubated with the different catalysts (0.75 µM) at 25°C.
Reaction conditions are as in Fig. 1, with the following
changes: 30 mM Hepes, pH 7.2, and 4 mM ZnCl2. DNA was
separated on a 12% polyacrylamide/8 M urea gel. k_obs
values were determined by fitting fraction ligated
vs. time to a linear equation using KaleidaGraph, and are
the average of two independent experiments measured at
less than 20% product formation. E47-3T and E47-AGA are
E47 derivatives in which the conserved TTT and AGA
sequences are deleted, respectively. E47-hairpin is an
E47 derivative in which the hairpin has been replaced by
5'-CCATG-3'. The background reaction, containing an
external template (see Fig. 1), was measured over a six
hour incubation. No product was detected in the absence
of the template, corresponding to a maximum rate of 2 x
10^-5 hr^-1.
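
The rate determination described for Fig. 3D (a linear fit of fraction ligated versus time, restricted to less than 20% product formation) can be sketched as follows. This is an ordinary least-squares illustration only, not the KaleidaGraph procedure actually used. The function names and the time-course data are hypothetical and are not measurements from this disclosure.

```python
def initial_phase(times_hr, fractions, max_fraction=0.20):
    """Keep only the points measured below `max_fraction` product."""
    kept = [(t, f) for t, f in zip(times_hr, fractions) if f < max_fraction]
    return [t for t, _ in kept], [f for _, f in kept]


def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def initial_rate(times_hr, fractions):
    """Initial rate (per hour) from the early, linear part of the curve."""
    t, f = initial_phase(times_hr, fractions)
    slope, _ = fit_line(t, f)
    return slope


# Hypothetical time course: hours and fraction of S2 ligated.
times = [0.0, 0.5, 1.0, 1.5, 2.0, 8.0]
ligated = [0.00, 0.02, 0.04, 0.06, 0.08, 0.27]
print(initial_rate(times, ligated))  # about 0.04 per hour for these data
```

Restricting the fit to the early part of the time course keeps the estimate in the range where product formation is approximately linear in time. Averaging two such estimates from independent experiments would mirror the procedure described above.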

Fig. 4A is a denaturing acrylamide gel analysis of 

an experiment showing the effect of Mg2+, Zn2+, and Cu2+
on catalysis. Reactions were incubated for 20 minutes at
the indicated divalent metal ion concentrations. No
reaction was detected in the absence of Zn2+ and Mg2+
(lane 2), or with only Mg2+ (lane 3). Mg2+ is not
required for activity, and Zn2+ alone (lane 4) catalyzes
the reaction with the same efficiency as Zn2+ and Mg2+
together. Cu2+ is the only divalent metal found that can
substitute for Zn2+ (lane 5); it does not require Mg2+ for
activity. The rate of ligation is independent of
monovalent metal ions. Potassium chloride can be
substituted by lithium chloride, sodium chloride, or cesium
chloride, or removed with no significant effect on
product formation.

Fig. 4B is a graph showing the effects of zinc (o)
and copper (•) concentrations on product formation. The
reaction incubation time was 7 minutes. 

Fig. 4C is a graph showing log(k_obs) versus pH.
In the presence of 10 µM CuCl2, there is a linear
correlation between the log of k_obs and pH, with a slope
of 0.7 up to pH 6.8. At higher pH values, the activity
decreases linearly with a slope of -0.7. A slope close
to +1 suggests that proton abstraction is involved in the
rate determining step of the reaction, while a slope of
-1 is indicative of proton donation (Fersht, Enzyme
Structure and Mechanism (Freeman, New York, 1985)). The
observed rate is independent of buffer concentration
between 30 and 150 mM. A similar effect was observed with
Zn2+ at 4 mM up to pH 7.4. At higher pH, the activity
drops drastically, possibly due to the formation of
insoluble metal oxides or hydroxides (Bailar, Jr. et al.,
Comprehensive Inorganic Chemistry (Pergamon Press Ltd.,
1973)). The reaction conditions were as specified in the
description of Fig. 3.
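
The relationship between slopes of +1 and -1 and two ionizable groups can be illustrated with the standard two-ionization model of a bell-shaped pH-rate profile. The sketch below is that textbook model only and is not a fit to the data of Fig. 4C. The two pKa values are hypothetical, chosen solely so that the maximum falls near pH 6.8. The idealized model gives limiting slopes of +1 and -1, whereas the observed slopes were 0.7 and -0.7.

```python
import math


def log_rate(ph, pka_1, pka_2, log_k_max=0.0):
    """log10 of the rate for a catalyst that is active only when group 1
    (pka_1) is deprotonated and group 2 (pka_2) is still protonated.

    rate = k_max / (1 + 10**(pka_1 - pH) + 10**(pH - pka_2))
    """
    denom = 1.0 + 10.0 ** (pka_1 - ph) + 10.0 ** (ph - pka_2)
    return log_k_max - math.log10(denom)


def slope(ph_low, ph_high, pka_1, pka_2):
    """Slope of log10(rate) versus pH between two pH values."""
    rise = log_rate(ph_high, pka_1, pka_2) - log_rate(ph_low, pka_1, pka_2)
    return rise / (ph_high - ph_low)


# Hypothetical pKa values placing the optimum at (6.3 + 7.3) / 2 = 6.8.
PKA_1, PKA_2 = 6.3, 7.3
for ph in (5.0, 6.0, 6.8, 7.5, 8.5):
    print(ph, round(log_rate(ph, PKA_1, PKA_2), 3))
```

In this model the acidic limb rises because an increasing fraction of group 1 is deprotonated, and the basic limb falls because group 2 loses its proton, which corresponds to the proton abstraction and proton donation interpretations given above.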

Isolation of a DNA molecule having DNA ligase activity

Oligodeoxynucleotides can be non-enzymatically
ligated on either single-stranded (Naylor et al.,
Biochemistry 5:2722-2728, 1966) or duplex (Luebke et al.,
J. Am. Chem. Soc. 111:8733-8735, 1989) DNA templates. We
designed an in vitro selection strategy (Szostak, Trends
Biochem. Sci. 17:89-93, 1992; Chapman et al., Curr. Opin.
Struct. Biol. 4:618-622, 1994; Breaker et al., Trends
Biotechnol. 12:268-275, 1994; Joyce, Curr. Opin. Struct.
Biol. 4:331-336, 1994) in order to determine whether DNA
sequences which catalyze DNA ligation more efficiently
than non-enzymatic templating could be isolated from a
large pool of random sequences (Fig. 1). Using this
strategy, a small single-stranded DNA that is a
Zn2+/Cu2+-dependent metalloenzyme was isolated. The enzyme
catalyzes the formation of a new phosphodiester bond by
the condensation of the 5'-hydroxyl group of one
oligodeoxynucleotide and a 3'-phosphorimidazolide group
on another oligodeoxynucleotide, and shows multiple
turnover ligation.

The details of the selection strategy are
illustrated in Fig. 1. After nine cycles of selection
and amplification, the DNA pool (pool 9) displayed
efficient ligation activity (Fig. 2A). Incubation of
pool 9 DNA with the activated substrate yields a ligated
product with the correct molecular weight and the
expected nucleotide sequence at the ligation junction.
To analyze further the selected sequences, DNA from pool
9 was cloned and sequenced. The majority of the clones
contain a common consensus sequence consisting of two
small domains separated by a spacer region of variable
length and sequence (Fig. 2B). The two small domains are
embedded in entirely different flanking sequences,
indicating that several independent sequences in the
original pool were carried through the selection process.
Inspection of the consensus sequence suggests a secondary
structure that is more complex than a simple template,
but nevertheless brings the 5'-hydroxyl group and the
3'-phosphorimidazolide group into close proximity (Fig. 3A).

Based on the consensus sequence, a small 47 nt
ssDNA catalyst (E47) was designed that ligates two
separate DNA substrates, S1 and S2 (Fig. 3B). Incubation
of radiolabeled S2 with activated substrate S1 and E47
catalyst results in the appearance of the expected
ligated product (Fig. 3C). Product formation requires
that all three components are present in the reaction.
In addition, the 3'-phosphate group of S1 must be
activated. E47 catalyzes the ligation reaction twice as
fast as pool 9. Small deletions within E47 result in
severe losses of catalytic efficiency (Fig. 3D),
indicating that the central consensus sequence is
necessary for catalysis. The initial rate of ligation of
S1 and S2 by E47 is 3400-fold greater than the rate of
the same reaction catalyzed by a simple complementary
template under the same conditions, and is at least
10^5-fold faster than the untemplated background ligation
(Fig. 3D). This rate enhancement is comparable to values
obtained for ribozymes obtained by in vitro selection
(Szostak, Trends Biochem. Sci. 17:89-93, 1992; Chapman et
al., Curr. Opin. Struct. Biol. 4:618-622, 1994; Breaker
et al., Trends Biotechnol. 12:268-275, 1994; Joyce, Curr.
Opin. Struct. Biol. 4:331-336, 1994) and catalytic
antibodies (Lerner et al., Science 252:659-667, 1991).

Since the catalyst is not consumed in the
reaction, it was expected that E47 would be capable of
catalyzing the ligation of several molar equivalents of
substrates S1 and S2, provided that the ligated product
is able to dissociate from the enzyme. At saturating
concentrations (140 µM) of both substrates and 1 µM E47,
multiple turnover catalysis at a rate of 0.66 hr^-1 at 25°C
and 2.4 hr^-1 at 35°C was observed (10 turnovers observed).
At these temperatures, product release appears to be rate
limiting, as a rapid initial burst of approximately one
equivalent of product formation was observed within the
first 10 minutes of the reaction. The initial rate of
ligation in this burst phase was directly proportional to
the concentration of E47 over a 30-fold range, as
expected for an enzyme at saturating substrate
concentration (Fersht, Enzyme Structure and Mechanism
(Freeman, New York, 1985)). A plot of k_obs vs. [E47]
yielded a k_cat of 3.2 hr^-1 (0.07 min^-1) at 25°C.

Because divalent metal ions play a crucial role in
ribozymes (Pyle, Science 261:709-714, 1993) and many
protein enzymes (Karlin, Science 261:701-708, 1993), it
was expected that the DNA catalyst would require
Mg²⁺ and/or Zn²⁺ for activity, as these ions were present
in the selection buffer. Indeed, the ligation reaction
is dependent on Zn²⁺ (Fig. 4A), but does not require Mg²⁺.
All of the members of the Irving-Williams series (Ba²⁺,
Sr²⁺, Ca²⁺, Mg²⁺, Mn²⁺, Fe²⁺, Co²⁺, Ni²⁺, Cu²⁺, Zn²⁺), as
well as Pb²⁺ and Cd²⁺, were tested at concentrations
between 10 µM and 10 mM, and it was found that only Cu²⁺
could substitute for Zn²⁺. The efficiency of the ligation
reaction is highly dependent on the divalent metal ion
concentration (Fig. 4B). Increasing concentrations of
Zn²⁺ up to 4 mM enhanced activity, but at higher
concentrations the activity dropped sharply, suggesting
the existence of inhibitory metal binding sites. A
similar concentration dependence was observed for copper,
but at a 400-fold lower concentration. The metal ion
specificity suggests the existence of one or more metal
ion binding sites with stringent geometrical and/or size
requirements.

To gain insight into the ligation mechanism, the
pH-rate profile of the reaction under pre-steady-state
(single turnover) conditions was determined (Fig. 4C).
The bell-shaped profile displayed with Cu²⁺ suggests that
the rate limiting step of the ligation reaction depends
in part on two ionizable groups, one acidic and one
basic, raising the possibility of a general acid-base
mechanism (Fersht, Enzyme Structure and Mechanism
(Freeman, New York, 1985)) in which copper complexes are
involved in proton transfer. Metal-ion hydroxides are
thought to act as general bases in some ribozyme-mediated
RNA cleavage reactions (Pyle, Science 261:709-714, 1993;
Dahm et al., Biochemistry 32:13040-13045, 1993; Pan et
al., Biochemistry 33:9561-9565, 1994). Other
possibilities, such as pH-dependent folding effects, may
also account for these observations (Kao et al., Proc.
Natl. Acad. Sci. USA 77:3360-3364, 1980).
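
A bell-shaped pH-rate profile of the kind described above is commonly modeled with two ionizable groups, one that must be deprotonated and one that must be protonated for activity. The following minimal sketch shows that textbook model. It is not a fit to the data of Fig. 4C. The pKa values and the maximal rate are hypothetical placeholders chosen only to produce a curve, and the function name is illustrative.

```python
def bell_rate(ph: float, k_max: float, pka_low: float, pka_high: float) -> float:
    """Two-ionizable-group model of a bell-shaped pH-rate profile.

    The group with pka_low must be deprotonated and the group with
    pka_high must be protonated for the catalyst to be active.
    """
    return k_max / (1.0 + 10.0 ** (pka_low - ph) + 10.0 ** (ph - pka_high))


# Hypothetical parameters, NOT taken from the source.
K_MAX = 1.0
PKA_LOW = 6.5
PKA_HIGH = 8.0

# In this model the optimum lies midway between the two pKa values.
ph_optimum = (PKA_LOW + PKA_HIGH) / 2.0

for ph in (5.5, 6.5, ph_optimum, 8.0, 9.0):
    rate = bell_rate(ph, K_MAX, PKA_LOW, PKA_HIGH)
    print(f"pH {ph:4.2f}: relative rate {rate:.3f}")
```

The model falls off on both sides of the optimum, which is the shape a general acid-base mechanism would be expected to give. As the text notes, pH-dependent folding could produce a similar curve, so the shape alone does not establish the mechanism.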

E47 and substrates S1 and S2 were modified so that
ligation of the modified substrates by the modified
enzyme results in formation of a ligated product having
the sequence of the modified enzyme. The sequences of
three such enzymes (E), and their corresponding
substrates (S1 and S2), are as follows:

I. E: 5'-ACCTTCACCTTCTTTCGCTAGACCTTCAAGCGGAAGGTGAAGGTCTAGCG-3'
(SEQ ID NO: 33)

S1: 5'-ACCTTCACCTTCTTTCGCTAGACCTTCAAGC-3'
(SEQ ID NO: 34)

S2: 5'-GGAAGGTGAAGGTCTAGCG-3' (SEQ ID NO: 35)

II. E: 5'-ACCTTCACCTTCTTTCGCTAGACCTTCAAGCGGAAGGTGAAGGTCTA-3'
(SEQ ID NO: 36)

S1: 5'-ACCTTCACCTTCTTTCGCTAGACCTTCAAGC-3'
(SEQ ID NO: 34)

S2: 5'-GGAAGGTGAAGGTCTA-3' (SEQ ID NO: 37)

III. E: 5'-CTTCACCTTCTTTCGCTAGACCTTCAAGCGGAAGGTGAAGGTCTA-3'
(SEQ ID NO: 38)

S1: 5'-CTTCACCTTCTTTCGCTAGACCTTCAAGC-3'
(SEQ ID NO: 39)

S2: 5'-GGAAGGTGAAGGTCTA-3' (SEQ ID NO: 37)
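
The design principle stated above, that the ligated product has the sequence of the enzyme itself, can be checked directly from the listed sequences. The following minimal sketch assumes that ligation joins the 3' end of S1 to the 5' end of S2, so that the product reads 5'-S1-S2-3'. The dictionary and function names are illustrative only.

```python
# (enzyme E, substrate S1, substrate S2), all written 5' to 3'.
DESIGNS = {
    "I": (
        "ACCTTCACCTTCTTTCGCTAGACCTTCAAGCGGAAGGTGAAGGTCTAGCG",
        "ACCTTCACCTTCTTTCGCTAGACCTTCAAGC",
        "GGAAGGTGAAGGTCTAGCG",
    ),
    "II": (
        "ACCTTCACCTTCTTTCGCTAGACCTTCAAGCGGAAGGTGAAGGTCTA",
        "ACCTTCACCTTCTTTCGCTAGACCTTCAAGC",
        "GGAAGGTGAAGGTCTA",
    ),
    "III": (
        "CTTCACCTTCTTTCGCTAGACCTTCAAGCGGAAGGTGAAGGTCTA",
        "CTTCACCTTCTTTCGCTAGACCTTCAAGC",
        "GGAAGGTGAAGGTCTA",
    ),
}


def ligation_product(s1: str, s2: str) -> str:
    """Assumed product: 3' end of S1 joined to the 5' end of S2."""
    return s1 + s2


def product_matches_enzyme(enzyme: str, s1: str, s2: str) -> bool:
    return ligation_product(s1, s2) == enzyme


for name, (enzyme, s1, s2) in DESIGNS.items():
    ok = product_matches_enzyme(enzyme, s1, s2)
    print(f"{name}: E = {len(enzyme)} nt, S1 + S2 = {len(s1) + len(s2)} nt, match = {ok}")
```

For all three designs the concatenated substrates reproduce the enzyme sequence, with lengths of 50, 47 and 45 nucleotides, in agreement with the lengths given in the sequence listing for SEQ ID NOs: 33, 36 and 38.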

The differences between these enzymes and E47 are in (1)
the stem formed between E47 and the 5'-hydroxyl-containing
substrate S2, (2) the stem formed between E47
and the activated substrate S1, (3) the intramolecular
stem in E47, and (4) the loop in E47. The sequence of
the presumed core of the ligation site was not changed.
The modified enzymes differ from one another only in the
number of base pairs between the enzyme and the
substrates. The modified enzymes catalyze ligation of
their respective substrates, which shows that the primary
nucleotide sequences of at least some parts of the stem
and loop structures depicted in Fig. 3B are not required
for enzyme activity, and further that the unchanged
regions of the enzyme are sufficient for maintenance of
ligase activity, in the presence of the stem structures
defined by S1-E1 and S4-E6.
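
The stem and loop elements discussed above can be related to the segment layout 5'-E1-TTT-E2-AGA-E3-E4-E5-E6-3' recited in claim 16. The following minimal sketch splits the 47-nucleotide E47 sequence (SEQ ID NO: 28) into those segments and checks that E3 and E5 can pair to form the intramolecular stem. The segment lengths used here are an assumption chosen for illustration, consistent with the sizes mentioned in claims 19 and 20. They are not boundaries stated in the description, and all names are hypothetical.

```python
E47 = "CGGATAGTGTTCTTTCGCTAGACCATGTGACGCATGGTGAGATGCTT"  # SEQ ID NO: 28

# Assumed segment lengths (illustrative, not stated in the source).
LAYOUT = [
    ("E1", 12), ("TTT", 3), ("E2", 4), ("AGA", 3),
    ("E3", 5), ("E4", 5), ("E5", 5), ("E6", 10),
]


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def split_segments(seq: str, layout: list) -> dict:
    """Cut seq into consecutive named segments of the given lengths."""
    if sum(length for _, length in layout) != len(seq):
        raise ValueError("layout does not cover the whole sequence")
    segments, position = {}, 0
    for name, length in layout:
        segments[name] = seq[position:position + length]
        position += length
    return segments


segments = split_segments(E47, LAYOUT)
for name, _ in LAYOUT:
    print(f"{name:>3}: {segments[name]}")

# E3 and E5 form a stem if E5 is the reverse complement of E3.
stem_pairs = reverse_complement(segments["E3"]) == segments["E5"]
print("E3/E5 can form a stem:", stem_pairs)
```

With these assumed boundaries the fixed TTT and AGA motifs fall where the layout expects them, E3 and E5 are exact reverse complements, and E4 is the five-nucleotide loop between them.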
SEQUENCE LISTING

(1) GENERAL INFORMATION:
(i) APPLICANT: The General Hospital Corporation
(ii) TITLE OF INVENTION: CATALYTIC DNA
(iii) NUMBER OF SEQUENCES: 39
(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: Fish & Richardson P.C.
(B) STREET: 225 Franklin Street
(C) CITY: Boston
(D) STATE: MA
(E) COUNTRY: USA
(F) ZIP: 02110-2804
(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: PatentIn Release #1.0, Version #1.30
(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: 08/487,867
(B) FILING DATE: 07-JUN-1995
(C) CLASSIFICATION:
(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: Lech, Karen F.
(B) REGISTRATION NUMBER: 35,238
(C) REFERENCE/DOCKET NUMBER: 00786/273001
(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: (617) 542-5070
(B) TELEFAX: (617) 542-8906
(C) TELEX: 200154


(2) INFORMATION FOR SEQ ID NO:1:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 115 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:
TATGTGTCGA TTGTGTTCTT TCGCTAGACC ATGTGAGACT TATGCTTCGA ATTGTCGAGT 60 
TTTTGACTGT TTGCTTGGCC GGCTGGTGGT CGTGCATGGT GAGATGATTA CCCTA 115 
(2) INFORMATION FOR SEQ ID NO:2:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 115 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:
TATGTCTCGA TTGTGTTCTT TOCCTAGACC ATGTCGGACT TATGCTTCCA ATTGTCGACT 60 
TTTTGACTGT TTGCTTGGCT GGCTGGTGCC CGCGCATGCT GAGATGATTA TCCCT 115 
(2) INFORMATION FOR SEQ ID NO:3:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 116 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:
TATGTGTCGA TTGTGTTCTT TCGCTAGACC ATCTGAGACT TATGCTTCCA ATTGTCGACT 
TTTTGACTGT TTGCTTGGCC GGCTGGTGGT CGCGCATGCT GAGATGATTA TCCCTA 
(2) INFORMATION FOR SEQ ID NO:4:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 117 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:
TATGTGTCGA TTGTGTTCTT CCGCTAGACC ATCTGAGACT TATGCTTCCA ATTGTCGACT 
TTTTGACTGT TTGCTTGGCC GGCTGGTGGT CGCGCATGCT GAGATGATTA TTCCCTG 
(2) INFORMATION FOR SEQ ID NO:5:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 116 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:
TATAGTCACC CTGGTAGGGT TCTTTCGCAG ACTGCGATGT CTTTTGATTT GAACTTATTT 60 
ATGAGGTCTG TTGAACCCCA TTGCGACTGA GTGCTTGCTG CTTCTTACTT TCCCTT 116
(2) INFORMATION FOR SEQ ID NO:6:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 116 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:
TATAGTCAGO CTGGTAGGGT TCTTTCGCAG ACTGCGATGT CTTTTGATTT GAACTTATTT 60 
ATGAGGTCTG TTGAACCCCA TTGCGACTGA GTG C TTGCTO CTTCTTACTT TCCCAT 116 
(2) INFORMATION FOR SEQ ID NO:7:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 116 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:
TATAGTCACC CTGGTAGGGT TCTTTCGCAG ACTGCGATGT C TTTTG ATTT GAACTTATTT 60 
ATGAGGTCTG TTGAACCCCA TTGCGACTGA GTCC T TC CGC CTTGTTACTT TCCCAT 116 
(2) INFORMATION FOR SEQ ID NO:8:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 116 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:
TATAGTCAGG CTGGTAGGGT TCTTTCGCAG AGTGCGATCT GTTTTGATTT GAACTTATTT 60
ATGAGGTCCG TTGAAGCTCA TTGCGACTGA GTGCTTGCTO CTTGTTACTT TCCCAC 116 
(2) INFORMATION FOR SEQ ID NO:9:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 116 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:

OGTTTCGTTT TGGAAGGCCT GTTGGTCCTT GTGTTCTCTC CCAGACCACT TTTTOGTACA 60 

OGGAAGTGGA TTAAGTGGTG AGTTGCTTTC TAGTATGCGC TTTGAGGTAT TCTATG 116 

(2) INFORMATION FOR SEQ ID NO:10:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 116 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:10:
CGTTTCGATT TGGAAGGCCT GTTGGTCCTT GTGTTCTCTC CCAGACCACT TTTTCGTTCA 60 
CGGAAGTGGA ATAAGTGGTG AGTTGCTTTC TAGTGTGCGC TTTGAGGTAT TCTATG 116 
(2) INFORMATION FOR SEQ ID NO:11:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 116 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:11:
OGTTTCGTTT TGGAAGGCCT GTTGGTCCTT GTGTTCTCTC GCAGACCACT TTTTCGTTCA 60 
OGGAAGTGGA TTAAGTGGTG AGTTGCTTTC TAGTGTGCGC TTTGAGGAAT TCTATG 116 
(2) INFORMATION FOR SEQ ID NO:12:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 116 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:12:
CCTCTTGCTO CGTTTTTCCT CGGTATCGTT CTTTCGCTAO ACCTTTAAAT AATGGTGAGA 60 
TCCTCTTTTT CAGGCTAGTA GCGOGGGATT GGGCGTTACC GTCGTTTGTC TTTCGA 116 
(2) INFORMATION FOR SEQ ID NO:13:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 115 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:13:
CACGTACTTC TTGTAGACGT GTGGCTTTGA TAGGATGTGG TCTTTCGCTA GAGTTAATTA 60 
CCTGTGGACC CTTAAGCTGT CTTAACTGAO ATGCTTTCAT TTTGTCTTTC TGATT 115
(2) INFORMATION FOR SEQ ID NO:14:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 116 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:14:
GAGCGTGGCT AACTGGATAG TGGTCTCTCG CTAGACACCT GTGTGAGATT GTTAGAATGC 60 
GGTCCATCTG CCTATTTGCT AGTTAAGGGT TTATGCTCTT CCTCTGATCA CTTTCG 116 
(2) INFORMATION FOR SEQ ID NO:15:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 115 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:15:
CTTTTTGTGT TTGAOGAATA OCT6TTCTTT CGCAGACCTT GTGCATCTTT GTTGTCGCAA 
GCTGAGATGC TTGTCTTGTT TCCTTTTTCA IGTTTC C TT C TCCTTOTTTT TAAAC 
(2) INFORMATION FOR SEQ ID NO:16:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 116 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:16:
TTQTGGTTGT GACCGGTTAG CATAGTGTTA TTTCGCAGAC CACATCACCG TATTTTGGTO 
ACTGGTGAGA TGCTGCTATT TTCTCGTCTT CCACCCGCTT AAATACTTCO AGGTTT 
(2) INFORMATION FOR SEQ ID NO:17:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 116 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:17:
TTTGGTTTCG CAGTTGGTGT CTTCCTTCGC AGACCCTTTG GGTGAGATTG CTTTTCCGCC 
TTTGACTCAT CCTCCCTTGT GGTATTGTTG TGCATGTGAT AGCTTGTTCT GCTCAT 
(2) INFORMATION FOR SEQ ID NO:18:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 114 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:18:
TGGGGATCCC GGTATTAGTG TCTGCGTACT TTGGCTGACG GTGGCCOTOG TGOTATCTCT 60

GTTCTCTCCC ATCATCCAAT CTTCCCGGTT GGATGAGATO CTTGATTATG CTTA 114 

(2) INFORMATION FOR SEQ ID NO:19:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 117 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:19:
TTTCTTGGGC TTAAGCTCGG TTATTGTTCT TTCGCTAGAT CCATGTCTAT ATTATGGTTG 60 
GGCCGACTGG TTTTTTACTT ATACTATTGT TTTTGTGGOG TGGATGAGAT GCTGTTT 117 
(2) INFORMATION FOR SEQ ID NO:20:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 116 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:20:
TCAGGTGTTT TTGTTTTTCT GAGCAGGGAG TCGGTGTGTT CTTTCGCAGA CAOGA CTT TT 60 
TTGTGTGAGA TTGCTTAGTG TTCTTTGTTC AATCACTAGA TTTCTTGATG GGTGTG 116 
(2) INFORMATION FOR SEQ ID NO:21:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 115 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:21:
GTCGGTTCAT GTTGTTCTTT CGCCAGATGA TCGCGGCGTT TTAGTTTAOG TCACTCGACG 60 
TATTTTCTAC GGGGTTTAGG CTTTGTCGAT CATGAGTTGC TTAGATTGAT TTTTT 115 
(2) INFORMATION FOR SEQ ID NO:22:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 27 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:22:
CGGATACTGT TCTTTCCCTA GANNNNN 
(2) INFORMATION FOR SEQ ID NO:23:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 15 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:23:
NNNNNTGAGA TGCTT 

(2) INFORMATION FOR SEQ ID NO:24:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 13 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:24:
AAGCATCTGA AGC 

(2) INFORMATION FOR SEQ ID NO:25:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 13 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:25:
GGAACACTAT CCG

(2) INFORMATION FOR SEQ ID NO:26:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 26 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:26:
CGGATAGTGT TCTTTCGCTA GANNNN 
(2) INFORMATION FOR SEQ ID NO:27:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 14 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:27:
NNNNTGAGAT GCTT 

(2) INFORMATION FOR SEQ ID NO:28:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 47 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:28:
CGGATAGTGT TCTTTCGCTA GACCATGTGA CGCATGGTGA GATGCTT 
(2) INFORMATION FOR SEQ ID NO:29:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:29:
GGAACACTAT CCGACTGGGA CC 
(2) INFORMATION FOR SEQ ID NO:30:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:30:
CGGGATCCTA ATGACCAAGG 
(2) INFORMATION FOR SEQ ID NO:31:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:31:
AAGCATCTAA GCATCTCAAG C 
(2) INFORMATION FOR SEQ ID NO:32:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 26 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:32:
CGGATAGTGT TCCGCTTCAG ATGCTT 
(2) INFORMATION FOR SEQ ID NO:33:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 50 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:33:
ACCTTCACCT TCTTTCGCTA GACCTTCAAG CGGAAGGTGA AGGTCTAGCG

(2) INFORMATION FOR SEQ ID NO:34:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 31 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:34:
ACCTTCACCT TCTTTCGCTA GACCTTCAAG C
(2) INFORMATION FOR SEQ ID NO:35:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 19 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:35:
GGAAGGTGAA GGTCTAGCG 
(2) INFORMATION FOR SEQ ID NO:36:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 47 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:36:
ACCTTCACCT TCTTTCGCTA GACCTTCAAG CGGAAGGTGA AGGTCTA
(2) INFORMATION FOR SEQ ID NO:37:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 16 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:37:
GGAAGGTGAA GGTCTA
(2) INFORMATION FOR SEQ ID NO:38:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 45 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:38:
CTTCACCTTC TTTCGCTAGA CCTTCAAGCG GAAGGTGAAG GTCTA

(2) INFORMATION FOR SEQ ID NO:39:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 29 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:39:
CTTCACCTTC TTTCGCTAGA CCTTCAAGC 



What is claimed is: 
CLAIMS

1. A method for obtaining a nucleic acid molecule 
having ligase activity, said method comprising the steps 
of: 

a) providing a population of candidate nucleic 
acid molecules, each having a region of random sequence; 

b) contacting said population with:

(i) a substrate nucleic acid molecule; and 

(ii) an external template complementary to a 
portion of the 3' region of said substrate nucleic 
acid molecule and a portion of the 5' region of 
each of the candidate nucleic acid molecules in 
said population, wherein binding of said external 
template to said substrate nucleic acid molecule 
and a candidate nucleic acid molecule from said 
population juxtaposes said 3' and 5' regions, and 
the terminal nucleotide of either said 3' or said 
5' region contains an activated group; 

c) isolating a subpopulation of nucleic acid 
molecules having ligase activity from said population; 

d) amplifying said subpopulation in vitro; 

e) optionally repeating steps b-d for said 
amplified subpopulation; and 

f) isolating said nucleic acid molecule having
ligase activity from said amplified subpopulation. 

2. The method of claim 1, wherein said optional 
repeating of steps b-d is carried out in the absence of 
said external template. 

3. The method of claim 1, wherein said nucleic 
acid molecule having ligase activity or said substrate 
nucleic acid molecule is DNA. 
4. The method of claim 1, wherein the 5' terminal
nucleotide of said substrate nucleic acid contains a 
biotin moiety. 

5. The method of claim 1, wherein said activated
group is a 3'-phosphorimidazolide on the 3' terminal
nucleotide of said substrate.

6. A method for obtaining a DNA molecule having
ligase activity, said method comprising the steps of:

a) providing a population of candidate DNA 
molecules, each having a region of random sequence; 

b) contacting said population with a substrate 
nucleic acid molecule; 

c) isolating a subpopulation of DNA molecules 
having ligase activity from said population; 

d) amplifying said subpopulation in vitro; 

e) optionally repeating steps b-d for said 
amplified subpopulation; and 

f) isolating said DNA molecule having ligase
activity from said amplified subpopulation. 

7. The method of claim 6, wherein said substrate 
nucleic acid molecule is DNA. 

8. The method of claim 6, wherein the 5' terminal 
nucleotide of said substrate nucleic acid contains a 
biotin moiety. 

9. The method of claim 6, wherein said activated 
group is a 3'-phosphorimidazolide on the 3' terminal 
nucleotide of said substrate. 

10. A DNA molecule capable of acting as a 
catalyst. 
11. A DNA molecule capable of acting as a
catalyst on a nucleic acid substrate, said catalysis not
requiring the presence of a ribonucleotide in said
nucleic acid substrate.

12. A nucleic acid molecule having ligase
activity.

13. The nucleic acid molecule of claim 12,
wherein said nucleic acid molecule is DNA.

14. The nucleic acid molecule of claim 12,
wherein said ligase activity is DNA ligase activity.

15. A nucleic acid molecule capable of ligating a
first substrate nucleic acid to a second substrate
nucleic acid, wherein the rate of said ligating is
greater than the rate of ligating said substrate nucleic
acids by templating under the same reaction conditions.
16. A catalytic DNA molecule capable of ligating
a first substrate nucleic acid to a second substrate
nucleic acid, said first substrate nucleic acid
comprising the sequence 3'-S1-S2-5', said second substrate
nucleic acid comprising the sequence 3'-S3-S4-5', and said
catalytic DNA molecule comprising the sequence
5'-E1-TTT-E2-AGA-E3-E4-E5-E6-3', wherein

S1 comprises at least two nucleotides positioned
adjacent to the 3' end of S2, said S1 nucleotides being
complementary to an equivalent number of nucleotides in
E1 that are positioned adjacent to the 5' end of said
TTT;

S2 comprises one to three nucleotides, S3 comprises
one to six nucleotides, and the 5' terminal nucleotide of
S2 and the 3' terminal nucleotide of S3 alternatively
contain an activated group or a hydroxyl group;

S4 comprises at least two nucleotides positioned
adjacent to the 5' end of S3, said S4 nucleotides being
complementary to an equivalent number of nucleotides in
E6 that are positioned adjacent to the 3' end of E5;

E1 comprises at least two nucleotides positioned
adjacent to the 5' end of said TTT, said E1 nucleotides
being complementary to an equivalent number of
nucleotides in S1 that are positioned adjacent to the 3'
end of S2;

E2 comprises zero to twelve nucleotides;

E3 comprises at least two nucleotides positioned
adjacent to the 3' end of said AGA, said E3 nucleotides
being complementary to an equivalent number of
nucleotides in E5 that are positioned adjacent to the 5'
end of E6;

E4 comprises 3-200 nucleotides;

E5 comprises at least two nucleotides positioned
adjacent to the 5' end of E6, said E5 nucleotides being
complementary to an equivalent number of nucleotides in
E3 that are positioned adjacent to the 3' end of said
AGA; and

E6 comprises at least two nucleotides positioned
adjacent to the 3' end of E5, said E6 nucleotides being
complementary to an equivalent number of nucleotides in
S4 that are positioned adjacent to the 5' end of S3.

17. The catalytic DNA molecule of claim 16,
wherein E2 comprises three to four nucleotides.

18. The catalytic DNA molecule of claim 17,
wherein the 5' most nucleotide of S2 is complementary to
the 5' most nucleotide of E2; the 3' most nucleotide of S3
is complementary to the second 5' most nucleotide of E2;
and the second 3' most nucleotide of S3 is complementary
to the third 5' most nucleotide of E2.

19. The catalytic DNA molecule of claim 18,
wherein E2 comprises four nucleotides, and the third 3'
most nucleotide of S3 is complementary to the fourth 5'
most nucleotide of E2.

20. The catalytic DNA molecule of claim 16,
wherein

a) S2 comprises one nucleotide;

b) S3 comprises three nucleotides;

c) E4 comprises five nucleotides; or

d) E5 and E3 each comprise five nucleotides.

21. A method of ligating a first nucleic acid
molecule to a second nucleic acid molecule, said method
comprising contacting said first and said second nucleic
acid molecules with a nucleic acid molecule having ligase
activity.
22. The method of claim 21, wherein said nucleic
acid molecule having ligase activity is DNA.

23. The method of claim 21, wherein said ligase 
activity is DNA ligase activity. 

24. A nucleic acid molecule having ligase 
activity obtained by the steps of: 

a) providing a population of candidate nucleic 
acid molecules, each having a region of random sequence; 

b) contacting said population with: 

(i) a substrate nucleic acid molecule; and 

(ii) an external template complementary to a 
portion of the 3' region of said substrate nucleic 
acid molecule and a portion of the 5' region of 
each of the candidate nucleic acid molecules from 
said population, wherein binding of said external 
template to said substrate nucleic acid molecule 
and a candidate nucleic acid molecule in said 
population juxtaposes said 3' and 5' regions, and 
the terminal nucleotide of either said 3' or said 
5' region contains an activated group; 

c) isolating a subpopulation of nucleic acid 
molecules having ligase activity from said population; 

d) amplifying said subpopulation in vitro; 

e) optionally repeating steps b-d for said 
amplified subpopulation; and 

f) isolating said nucleic acid molecule having
ligase activity from said amplified subpopulation. 

25. The nucleic acid of claim 24, wherein said 
optional repeating of steps b-d is carried out in the 
absence of said external template. 
26. The nucleic acid molecule having ligase
activity of claim 24, wherein said nucleic acid molecule
having ligase activity is DNA.

27. The nucleic acid molecule having ligase
activity of claim 24, wherein

a) the 5' terminal nucleotide of said substrate
nucleic acid contains a biotin moiety; or

b) said activated group is a
3'-phosphorimidazolide on the 3' terminal nucleotide of said
substrate.

28. A DNA molecule having ligase activity 
obtained by the steps of: 

a) providing a population of candidate DNA 
molecules, each having a region of random sequence; 

b) contacting said population with a substrate 
nucleic acid molecule; 

c) isolating a subpopulation of DNA molecules 
having ligase activity from said population; 

d) amplifying said subpopulation in vitro;

e) optionally repeating steps b-d for said 
amplified subpopulation; and 

f) isolating said DNA molecule having ligase
activity from said amplified subpopulation. 

29. The DNA molecule having ligase activity of
claim 28, wherein

a) the 5' terminal nucleotide of said substrate
nucleic acid contains a biotin moiety; or

b) said activated group is a
3'-phosphorimidazolide on the 3' terminal nucleotide of said
substrate.



WO 96/40723 PCT/US96/09358 


SUBSTITUTE SHEET (RULE 26) 



[FIG. 2A: bands labeled "Ligated pool 9" and "Pool 9";
reaction time (min): 0, 10, 20, 40, 40.]

[FIG. 3C: components listed: activated S1 (1 µM), S2
(0.5 µM), E47 (0.75 µM), non-activated S1 (1 µM); time
(min): 30, 5, 10, 30, 30, 30; bands labeled "Ligated S2"
and "S2". The per-lane +/- assignments are not recoverable.]



[Sheet 3/8: alignment of sequences Seq01 through Seq21
(position scale 10 to 110), with two conserved regions
marked Domain I and Domain II. Consensus layout:
5'-(5-58)-Domain I-(2-60)-Domain II-(4-62)-3'. Legible
consensus fragments: cggataGTGTT (Domain I) and GAGATgct
(Domain II). The aligned sequences themselves are not
recoverable here; see SEQ ID NOs: 1-21 in the sequence
listing.]






[FIG. 3B: schematic of catalyst E47 with the activated
substrate S1 and the 5'-hydroxyl substrate S2. Legible
labels: 5' HO; 3'-GCCTATCACAAGG; 5'-CGGATAGTGTTC; CGCT.]



FIG. 3D

Catalyst             kobs (hr⁻¹)
E47                  3.4
E47-3T               <0.01
E47-AGA              <0.01
E47-hairpin          0.41
Pool 9               1.7
templated bkgrd.     0.001
background           <2 x 10⁻⁵



[FIG. 4A: lanes 1 to 5; conditions listed: Mg²⁺ (50 mM),
Zn²⁺ (4 mM), Cu²⁺ (10 µM). The per-lane +/- assignments
are not recoverable.]



WO 96/40723 



8/8 



PCT/US96/09358 



FIG. 4C. Plot of log(k_obs) (vertical axis, -2.8 to -1.2) against pH (horizontal axis, 5.5 to 9).






INTERNATIONAL SEARCH REPORT 



International application No. 
PCT/US96/09358 



A. CLASSIFICATION OF SUBJECT MATTER
IPC(6) : C07H 21/04; C12Q 1/68; C12P 19/34
US CL : 435/6, 91.2; 536/23.1, 25.4

According to International Patent Classification (IPC) or to both national classification and IPC

B. FIELDS SEARCHED

Minimum documentation searched (classification system followed by classification symbols)

U.S. : 435/6, 91.2; 536/23.1, 25.4

Documentation searched other than minimum documentation to the extent that such documents are included in the fields searched

Electronic data base consulted during the international search (name of data base and, where practicable, search terms used)
APS, MEDLINE, BIOSIS

search terms: deoxyribozyme, dna enzyme, catalytic dna, ribozyme, ligase



C. DOCUMENTS CONSIDERED TO BE RELEVANT

Category*   Citation of document, with indication, where appropriate, of the relevant passages   Relevant to claim No.

X, A   YANG et al. Minimum ribonucleotide requirement for catalysis by the RNA hammerhead domain. Biochemistry. 1992, Vol. 31, pages 5005-5009, especially page 5006.
       Relevant to claim No.: 10 (X); 1-9 and 11-29 (A)

X, A   CHARTRAND et al. An oligodeoxyribonucleotide with catalytic properties. Proc. RNA Soc. 1994, Vol. 77, page 5.
       Relevant to claim No.: 10 (X); 1-9 and 11-29 (A)

X, A   BREAKER et al. A DNA enzyme that cleaves RNA. Chem. Biol. December 1994, Vol. 1, pages 223-229.
       Relevant to claim No.: 10 (X); 1-9 and 11-29 (A)



[X] Further documents are listed in the continuation of Box C.   [ ] See patent family annex.



* Special categories of cited documents:

"A" document defining the general state of the art which is not considered to be of particular relevance

"E" earlier document published on or after the international filing date

"L" document which may throw doubts on priority claim(s) or which is cited to establish the publication date of another citation or other special reason (as specified)

"O" document referring to an oral disclosure, use, exhibition or other means

"P" document published prior to the international filing date but later than the priority date claimed

"T" later document published after the international filing date or priority date and not in conflict with the application but cited to understand the principle or theory underlying the invention

"X" document of particular relevance; the claimed invention cannot be considered novel or cannot be considered to involve an inventive step when the document is taken alone

"Y" document of particular relevance; the claimed invention cannot be considered to involve an inventive step when the document is combined with one or more other such documents, such combination being obvious to a person skilled in the art

"&" document member of the same patent family



Date of the actual completion of the international search
31 JULY 1996

Date of mailing of the international search report

Name and mailing address of the ISA/US
Commissioner of Patents and Trademarks
Box PCT
Washington, D.C. 20231
Facsimile No. (703) 305-3230

Telephone No. (703) 308-0196

Form PCT/ISA/210 (second sheet)(July 1992)*






C (Continuation). DOCUMENTS CONSIDERED TO BE RELEVANT 



Category*   Citation of document, with indication, where appropriate, of the relevant passages   Relevant to claim No.

X, A   BARTEL et al. Isolation of new ribozymes from a large pool of random sequences. Science. 10 September 1993, Vol. 261, pages 1411-1418, especially page 1412.
       Relevant to claim No.: 12, 15, 21, 24-25, and 27 (X); 1-11, 13-14, 16-20, 22-23, 26, and 28-29 (A)

Form PCT/ISA/210 (continuation of second sheet)(July 1992)*



